Goto

Collaborating Authors

 homoscedastic model


H-AddiVortes: Heteroscedastic (Bayesian) Additive Voronoi Tessellations

arXiv.org Machine Learning

This paper introduces the Heteroscedastic AddiVortes model, a Bayesian non-parametric regression framework that simultaneously models the conditional mean and variance of a response variable using adaptive Voronoi tessellations. By employing a sum-of-tessellations approach for the mean and a product-of-tessellations approach for the variance, the model provides a flexible and interpretable means to capture complex, predictor-dependent relationships and heteroscedastic patterns in data. This dual-layer representation enables precise inference, even in high-dimensional settings, while maintaining computational feasibility through efficient Markov Chain Monte Carlo (MCMC) sampling and conjugate prior structures. We illustrate the model's capability through both simulated and real-world datasets, demonstrating its ability to capture nuanced variance structures, provide reliable predictive uncertainty quantification, and highlight key predictors influencing both the mean response and its variability. Empirical results show that the Heteroscedastic AddiVortes model offers a substantial improvement in capturing distributional properties compared to both homoscedastic and heteroscedastic alternatives, making it a robust tool for complex regression problems in various applied settings.


Correlated Input-Dependent Label Noise in Large-Scale Image Classification

arXiv.org Machine Learning

Large scale image classification datasets often contain noisy labels. We take a principled probabilistic approach to modelling input-dependent, also known as heteroscedastic, label noise in these datasets. We place a multivariate Normal distributed latent variable on the final hidden layer of a neural network classifier. The covariance matrix of this latent variable, models the aleatoric uncertainty due to label noise. We demonstrate that the learned covariance structure captures known sources of label noise between semantically similar and co-occurring classes. Compared to standard neural network training and other baselines, we show significantly improved accuracy on Imagenet ILSVRC 2012 79.3% (+2.6%), Imagenet-21k 47.0% (+1.1%) and JFT 64.7% (+1.6%). We set a new state-of-the-art result on WebVision 1.0 with 76.6% top-1 accuracy. These datasets range from over 1M to over 300M training examples and from 1k classes to more than 21k classes. Our method is simple to use, and we provide an implementation that is a drop-in replacement for the final fully-connected layer in a deep classifier.


Analysis of Softmax Approximation for Deep Classifiers under Input-Dependent Label Noise

arXiv.org Machine Learning

Modelling uncertainty arising from input-dependent label noise is an increasingly important problem. A state-of-the-art approach for classification [Kendall and Gal, 2017] places a normal distribution over the softmax logits, where the mean and variance of this distribution are learned functions of the inputs. This approach achieves impressive empirical performance but lacks theoretical justification. We show that this model is a special case of a well known and theoretically understood model studied in econometrics. Under this view the softmax over the logit distribution is a smooth approximation to an argmax, where the approximation is exact in the zero temperature limit. We further illustrate that the softmax temperature controls a bias-variance trade-off and the optimal point on this trade-off is not always found at 1.0. By tuning the softmax temperature, we achieve improved performance on well known image classification benchmarks with controlled label noise. For image segmentation, where input-dependent label noise naturally arises, we show that tuning the temperature increases the mean IoU on the PASCAL VOC and Cityscapes datasets by more than 1% over the state-of-the-art model and a strong baseline that does not model this noise source.


Predicting Surgery Duration with Neural Heteroscedastic Regression

arXiv.org Machine Learning

Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a surgery-specific notion of our uncertainty about its duration. Estimating this uncertainty can lead to more nuanced and effective scheduling strategies, as we are able to schedule surgeries more efficiently while allowing an informed and case-specific margin of error. Using surgery records %from the UC San Diego Health System, from a large United States health system we demonstrate potential improvements on the order of 20% (in terms of minutes overbooked) compared to current scheduling techniques. Moreover, we demonstrate that surgery durations are indeed heteroscedastic. We show that models that estimate case-specific uncertainty better fit the data (log likelihood). Additionally, we show that the heteroscedastic predictions can more optimally trade off between over and under-booking minutes, especially when idle minutes and scheduling collisions confer disparate costs.